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— /  There  have  been  significant  advances  in  the  art  of  extracting  eigenvalues  and  eigenvec¬ 
tors  in  the  last  15  years.  This  fact  is  not  widely  appreciated. 

Topics  addressed:  the  special  problems  arising  in  Mechanics,  reduction  to  standard 
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The  contents  of  this  report  will  be  a  chapter  in  the  ”  State-of-the-art  Surveys  on 
Computational  Mechanics  "  to  be  published  by  the  American  Society  of  Mechani¬ 
cal  Engineers  and  edited  by  Prof.  A.  K.  Noor. 
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NOMENCLATURE 

We  apologize  for  not  using  familiar  engineering  notations!  practices  but  there  is  a  good  rea¬ 
son.  Few  special  notations  are  needed  in  this  paper  and  so  a  simple  system  (  avoiding  brackets, 
braces,  and  underlines  )  suffices. 


whole  numbers 

j,  ...,  n  (  Fortran  convention  ) 

real  numbers 

a,  0, 

column  vectors 

a,  b,  •  •  *  (  except  i,  ...,  n,  and 

components  of  vectors 

a(l),a(2),  ••• 

matrices  • 

A.B,  ••• 

identity  matrix 

f  . e»  ) 

imaginary  unit 

»  (  not  j  ) 

transpose 

o 1  or  A1 

time  derivative 

u 

norm 

j  j  V  J  j  :=  VPv  a  >/v(lF+“rr~ 

1.  INTRODUCTION 


In  the  previous  State-of-the-Art  survey  there  was  no  chapter  devoted  exclusively  to  eigen¬ 
value  extraction.  There  is  little  point  in  casting  back  to  1950  and  we  shall  confine  our  attention  to 
the  last  decade. 

The  primary  achievement  of  the  1970's  was  the  incorporation  of  robust  and  fairly  efficient 
eigenvalue  solvers  into  general  finite  element  packages  such  as  NASTRAN,  SAP,  ADINA,  .... 
Consequently  the  dynamic  analyses  of  structures  with  about  10s  degrees  of  freedom  (  d.o.f.  ) 
became  routine  activities  all  over  the  engineering  world.  Systems  with  104  d.o.f.  were  analyzed 
and  even  106  d.o.f.  was  uot  a  pipe  dream.  See  }8],  jl0|,  [20|,  (21j,  [22],  [28],  [38]  for  example. 

These  programs  permitted  more  intensive  analyses  of  aircraft  designs  and  so  helped  in  the 
production  of  the  recent  fuel-efficient  jetliners,  (  e.g.  Boeing  767  ). 

In  order  to  set  the  stage  for  the  rest  of  the  paper  some  facts  of  life  must  be  appreciated. 
First,  the  arithmetic  effort  required  to  compute  aH  the  eigenvalues  of  an  nXn  symmetric  matrix 
is  less  than  the  effort  required  to  form  the  product  of  two  such  matrices.  The  only  requirement  on 
n  is  that  it  be  possible  to  hold  two  full  n  Xn  arrays  in  the  fast  memory  of  the  computer.  These 
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matrices  are  said  to  be  small  for  the  given  computer  system.  See  [l],  [2],  and  |6l]. 

The  standard  eigenvalue  problem  for  small  matrices  is  solved  whether  the  matrix  is  sym¬ 
metric  or  not.  The  methods  are  utterly  reliable  and  fast,  see  section  2.  Fortran  programs  are 
available  in  most  computer  centers.  They  can  be  found  in  libraries  called  EISPACK,  IMSL,  and 
NAG,  among  others.  Even  if  an  engineer  wants  only  3  eigenvalues  (  or  eigenpairs  )  it  is  usually 
best  to  find  them  all  when  the  matrix  is  small.  The  largest  computed  eigenvalue  will  be  accurate 
to  almost  working  precision  (  i.e.  13  correct  decimals  if  the  unit  roundoff  is  10~^4 )  and  the  smaller 
eigenvalues  have  the  same  absolute  accuracy  as  the  largest  one.  See  [1].  Programs  for  extracting 
eigenvalues  of  small  matrices  should  be  used  like  the  built-in  subroutines  for  cosine  or  square  root. 
There  is  no  need  for  the  engineer  to  write  his  own  version.  For  small  matrices  it  is  all  right  to 
reduce  the  general  linear  eigenvalue  problem  to  standard  form  explicitly,  using  matrix  multiplica¬ 
tion.  See  Reduction  II  (  with  a  **  0  )  in  section  5. 

For  large  matrices  the  situation  is  different.  Nevertheless  it  is  still  not  advisable  for  an 
engineer  to  consult  a  book  and  try  to  implement  methods  described  there. 

During  the  1980’s  there  was  gradual  acceptance  of  the  fact  that  the  successful  Subspace 
Iteration  method,  on  which  the  1970  eigensolvers  were  often  based,  was  not  the  last  word  in 
efficiency.  See  [5],  [41].  Some  versions  of  the  Lanczos  algorithm  promised  to  be  an  order  of  magni¬ 
tude  more  efficient.  For  example,  a  block  Lanczos  eigensolver,  produced  by  Boeing  Computer  Ser¬ 
vices  under  the  leadship  of  L.  Komzsik,  was  incorporated  into  the  McNeal-Schwendler  NASTRAN 
package  during  the  summer  of  1985.  This  sophisticated  code  does  not  require  that  any  of  the  vec¬ 
tors  it  computes  remain  in  the  fast  store  (  i.e.  virtual  memory  has  been  accepted  ).  This  means 
that  the  program  can  analyze  quite  large  structures  on  fairly  small  computers  (  large  minis,  such 
as  the  VAX  750  ).  See  [26],  [38], 

All  these  eigensolvers  can  be  vectorized  to  some  extent  but  they  do  not  really  exploit  the  full 
power  of  machines  such  as  the  CYBER  205  or  the  CRAY  2.  See  [57|.  The  situation  may  well 
have  been  rectified  by  the  times  these  words  appear  in  print. 

In  the  remaining  parts  of  this  paper  we  will  try  to  convey  ”  the  big  picture  "  of  the  state- 
of-the-art  in  modal  analysis  and  buckling.  We  shall  point  the  reader  to  the  open  literature  for 
more  details  and  justification.  Inevitably  some  valuable  work  will  have  been  omitted  from  the 
references.  Slighted  researchers  are  urged  to  contact  the  author  and  fill  the  gaps  in  his  knowledge. 


2.  TECHNIQUES  FOR  SMALL  MATRICES 

We  give  a  brief  outline  of  the  methods  used  for  small  problems.  If  A  is  symmetric  it  is 
reduced  to  tridiagonal  form  T  (  i.e.  t(j,  k)  =  0  if  \j  —  k  I  >  1  )  by  explicit  orthogonal  similarity 
transformations.  If  eigenvectors  are  wanted  the  transformation  are  accumulated  and  the  cost  of 
this  aspect  dominates  all  the  others.  The  matrix  T  is  reduced  to  diagonal  form  A  by  the  QR  algo¬ 
rithm.  Fewer  than  n8  multiplications  are  required  to  obtain  all  n  eigenvalues.  If  all  the  eigenvec¬ 
tors  are  wanted  the  cost  rises  to  5  n*. 

For  unsymmetric  matrices  the  pattern  is  similar.  The  matrix  B  is  reduced  to  Hessenberg 
form  H  (  h(j,  k)  =*  0  if  j  >  k+l  )  by  explicit  orthogonal  similarity  transformations.  Then  H  is 
reduced  to  block  triangular  form  by  the  QR  algorithm.  The  blocks  on  the  diagonal  are  either 
2x2,  for  each  compiex  conjugate  pair,  or  1X1,  for  each  real  eigenvalue.  The  cost  for  all  eigen¬ 
values  is  about  5  n8  multiplications,  for  eigenvectors  too  the  cost  rises  to  15  n8. 

The  general  linear  eigenvalue  problem  A  —  \B  may  be  solved  without  inverting  A  or  B  by 
use  of  the  QZ  algorithm.  This  reduces  A  to  block  upper  triangular  form  A  and  B  to  upper  tri¬ 
angular  form  B.  The  diagonal  blocks  of  A  are  either  2X2  or  1X1.  The  cost  is  approximately 
15  n8  multiplications.  The  best  reference  for  all  these  algorithms,  and  more,  is  the  EISPACK 
guide,  see  [1,2].  The  reference  for  learning  about  the  computation  of  eigenvalues  is  Wilkinson's 
classic  tome  "  The  Algebraic  Eigenvalue  Problem  "  (  Oxford  Univ.  Press,  1966  ).  It  should  be 
stressed  that  the  understanding  engendered  by  that  book  was  translated  into  action  in  the  ” 
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Handbook  for  Automatic  Computation  :  vol.  II,  Linear  Algebra  ”,  (  Springer-Verlag,  1971  )  and 
the  EISPACK,  NAG,  and  IMSL  program  libraries  are  all  based  on  it. 


3.  PROBLEM  DESCRIPTION 

A  good  place  to  start  is  the  equation  of  motion  of  some  structure.  It  may  be  written 

Mil  +  Cu  +  Ku  —  f(t),  t> 0, 

where  the  vector  a  =  u(f )  represents  the  state  of  the  system  and  M,C,  K  are  square  matrices. 
The  modes  are  those  solutions  of  the  homogeneous  equation  (  i.e.  /  s  0  )  with  the  special  form 
u(t )  ezp(  iwt  )z  (  «2  *»  —1  ),  where  x  is  a  fixed  vector.  Thus  the  natural  frequency  u,  and  x, 
the  mode  shape  vector,  must  satisfy 

(  K  +  iutC  -  u>2M  )x  =  0. 

When  C  =*  0,  each  eigenpair  (  w,  x  )  represents  one  of  the  free  modes  of  vibration  of  a  conserva¬ 
tive  model  of  the  structure. 

The  algebraic  eigenvalue  equation  given  above  is  quadratic  in  u>  but,  when  C  =  0,  is  linear 
in  w2.  The  computation  of  u /  and  x  is  more  or  less  difficult  depending  on  the  structure  of  K,  C , 
and  M.  We  list  some  important  cases  in  increasing  order  of  difficulty.  See  [7],  [14],  [33],  [34]. 

a)  C  *=  0,  K  and  M  symmetric,  positive  definite  (  =*  s.p.d.  ). 

b)  C  =  0,  K  s.p.d.,  M  symmetric,  positive  semi-definite,  and  singular. 

c)  K,  C,  M  all  s.p.d.  (  dissipative  system  due  to  friction  ). 

d)  C  skew-symmetric,  K  and  M  s.p.d.  (  rotating  systems  ). 

e)  C  =  0,  K  s.p.d.,  M  sym.  but  indefinite  (  for  buckling  analysis,  where  A  =  c*/2  is  not  neces¬ 
sarily  real  ).  See  [76], 

f)  C  symmetric,  indefinite,  K  not  symmetric,  M  s.p.d.  (  rotating,  dissipative  structures, 
\  ss*  or  need  not  be  real  ).  See  [66], 

Although  cases  c,  d,  e,  f  are  being  solved  it  is  fair  to  say  that  they  are  still  research  topics  as 
far  as  eigenvalue  solvers  are  concerned.  See  [16],  [28]. 

Key  Features 

Common  to  all  the  cases  listed  above  are  the  following  special  requirements: 

i)  Only  a  few  pairs  (  w,  x  )  are  wanted,  rarely  more  than  100.  For  vibration  analysis,  where  w 
is  always  positive,  there  are  two  standard  demands;  either  all  frequencies  in  a  given  interval 
or  the  m  fundamental  modes  where  1  <  m  <  100. 

ii)  Most  of  the  elements  in  K ,  C,  and  M  are  zero.  When  finite  elements  are  used  to  construct 
the  matrices  then  they  have  a  loosely  banded  appearance,  depending  strongly  on  the  connec¬ 
tions  within  the  structure.  Also  the  number  of  degrees  of  freedom  keeps  growing.  In  the 
1980’s  10s  is  typical  but  10®  is  to  be  expected. 

4.  GENERAL  COMMENTS 

The  special  character  of  these  tasks,  embodied  in  i)  and  ii)  above,  turns  our  attention  away 
from  traditional  methods  that  seek  to  diagonalize  a  symmetric  matrix  by  means  of  a  sequence  of 
similarity  transformations.  See  [79].  Most  similarity  transformations  spoil  the  sparsity  pattern  in  a 
matrix  and  are  not  cost  effective  when  only  20  eigenvalues  out  of,  say,  2000  are  wanted. 

The  preferred  methods  today  are  all  campling  techniques.  They  create  a  matrix  (  or 
better,  a  linear  operator  )  and  apply  it  to  a  sequence  of  carefully  constructed  vectors.  From  these 
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transformed  vectors  the  dominant  eigenvectors  and  their  eigenvalues  can  be  approximated. 

Th«  Inertia  Count 

However  there  is  one  quite  different  technique  that  is  often  used  to  check  on  the  computed 
results.  It  is  an  application  of  Sylvester’s  Inertia  Theorem.  If  K  and  M  are  symmetric,  r  is  any 
real  number,  and  if  Gaussian  elimination,  without  interchanges,  is  performed  on  K  —  tM  to  pro¬ 
duce  the  triangular  factorization 

K  -  tM  -  LDL* 

where  D  is  a  diagonal  matrix  containing  the  pivots  and  L  is  lower  triangular  with  ones  on  the 
diagonal  and  the  multipliers  elsewhere,  then 

'  the  number  of  eigenvalues  of  the  pair  (  K,  M  )  that  are  less  than  r  is  just  the 

number  of  negative  pivots  in  D  *  See  f4,  p  46j 

It  any  pivot  is  tiny  it  is  best  to  change  r  by  say  .001%  and  do  the  factorization  again. 

This  useful,  though  expensive,  facility  is  sometimes  called  a  Sturm  Sequence  count  for  odd 
historical  reasons.  It  is  used  to  check  that  no  eigenvalues  have  been  missed  by  the  algorithm.  See 
16],  [70]. 

The  arrival  of  vector  computers  at  one  end  of  the  range  and  mini-  and  micro-  computers  at 
the  other  has  made  it  virtually  impossible  to  give  simple  cost  comparisons  between  methods.  See 
[75].  Nevertheless  it  often  happens  that  the  application  of  the  operator  (  or  matrix  )  to  a  vector 
dominates  all  other  costs.  So  it  is  not  unreasonable  to  compare  the  number  of  matrix-vector  pro¬ 
ducts  required  for  the  computation  by  various  algorithms.  Data  transfer  is  also  important. 

When  K  and  M  have  narrow  bandwidth  b  the  cost  of  the  triangular  factorization  men¬ 
tioned  above  is  quite  low  (  b2  n  ops  )  and  consequently  each  step  of  inverse  iteration  is  also  cheap. 
If  only  three  or  four  eigenpairs  are  wanted  it  is  quite  attractive  to  find  them  one  at  a  time  using 
combinations  of  zero  finding  techniques.  The  determinant  search  method  of  Bathe  is  a  good 
example  of  this  approach.  See  [7]  for  more  details. 


5.  REDUCTION  TO  STANDARD  FORM 


The  various  problems  described  above  each  involve  at  least  2  matrices.  Yet  the  most  power¬ 
ful  techniques  require  a  single  matrix  or  operator.  As  will  be  explained  below  it  is  clearer  to  use 
the  word  operator  than  matrix.  There  are  three  ways  to  reduce  the  problem 

(  K  -  w2Af  )q  -  0  (  1  ) 

to  standard  form 

(  B  -  \I  )x  —  0  (  2  ) 

Let  M  =  tV  be  the  Cholesky  factorization  of  M.  Here  L  is  a  lower  triangular  matrix. 
Sometimes  L  is  called  the  square  root  of  M  but  this  is  a  bad  abuse  of  language  because  the  square 
root  of  M  is  defined  to  be  the  symmetric  positive  definite  matrix  X  that  satisfies  X 2  =*  M. 

I.  Premultiply  (  1  )  by  t~*  and  seek  Lq  in  place  of  q  to  find 

(  L-'KL-1  -  uj2I  )Lq  =  0  (  3  ) 

Thus  the  operator  B  (  =»  L~lKL~l  )  is  symmetric.  Of  course  M  must  be  positive  definite  so  the 
application  of  (I)  is  limited.  There  is  no  need  to  form  the  matrix  B  explicitly.  See  [43). 

Method  (1)  is  a  poor  one  simply  because  we  want  eigenvalues  u2,  u2,  ...  close  to  0  and  these 
are  much  more  expensive  to  compute  than  those  near  oo. 
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II.  Take  any  convenient  value  a  near  the  wanted  eigenvalues.  The  value  a  *=  0  is  a  common  but 
not  very  effective  choice.  Rewrite  (  1  )  as  follows 

(  K  -  aM  -  (  w2  -  a  )M  )q  -  0, 

( (  w2  -  a  )-1/  -  (  K  -  aM  )~xM  )q  =  0, 


(  (  w2  -  a  )“1/  -  L\K  -  aM  )~XL  )Dq  =  0. 


(  4  ) 


Thus 

B  -  D(  K  -  aM  )~lL. 

Again  there  is  no  need  to  form  B  explicitly.  The  product  x  =  Bv  is  evaluated  in  3  steps:  w  *-  Lv\ 
solve  (  K  -  aM  )u  «=•  w  for  u;  x  «—  Du. 

Any  effective  method  for  solving  (  K  —  <rM  )tt  =  w  may  be  used,  direct  or  iterative.  If  the 
size  of  K  and  M  is  not  too  great  then  direct  methods  are  popular.  This  requires  a  preliminary 
factorization, 

K  -  crM  =*  LDD 


as  described  in  section  4.  See  [19],  [60]. 

Note  that  B  is  symmetric  and  its  eigenvalues  Xt,  X2,  ...,  are  related  to  the  wanted  frequencies 
by 


It  is  some  of  the  largest  eigenvalues  of  B  that  are  wanted  and  these  are  the  easiest  to  compute. 
Note  also  that  M  need  only  be  positive  semi-definite,  L  can  have  some  0  elements  on  its  diagonal. 

The  considerable  cost  of  factoring  K  -  aM  is  handsomely  offset  by  the  rapid  convergence  of 
the  iterative  methods  to  be  described  later. 


III.  It  is  not  necessary  that  B  be  a  symmetric  matrix.  It  is  sufficient  that  it  be  similar  to  a  sym¬ 
metric  matrix.  Consequently  the  last  step  in  deriving  (  4  )  may  be  omitted.  Thus 

(  (  w2  -  <7  )-*/  -{K  -  aM  )~lM  )q  =  0, 


and 


B  =  (K  -  aM  )-1  M 


The  eigenvectors  are  not  changed  at  all. 

There  is  a  price  to  pay  for  this  convenience.  At  one  or  more  places  in  the  algorithms  it  is 
necessary  to  multiply  a  vector  by  A/  whereas  use  of  reduction  II  avoids  this  step.  See  [60j. 

Both  II  and  III  are  satisfactory,  with  a  slight  preference  for  III  because  it  is  simpler.  It  will  be 
shown  in  a  later  section  that  the  algorithms  can  be  written  so  that  there  is  no  need  to  know 
whether  or  not  M  is  singular  or  ill-conditioned.  See  [48]  and  [68], 

The  important  point  is  that  the  user  supplies  a  subroutine  that  returns  Bv  for  any  given 
vector  v.  This  embodies  the  linear  operator  that  sends  v  into  Bv.  Yet  the  matrix  B  is  never 
formed  explicitly.  The  goal  now  is  to  call  that  subroutine  with  a  sequence  of  vectors  v  and  calcu¬ 
late  from  the  output  the  dominant  few  eigenvectors  of  B. 
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SUBSPACE  ITERATION  (  died  »  > 


. -  For. 

h  Europe  M  -  «—  %““‘h0d  Sin,'““e°” 

tunately  it  has  the  same  acronym  SL  throughout  the  engineering  commun- 

Programs  based  on  this  method  are  m  wid«P  *»  ^  effectiveaes8  of  the  Pjograms  comes 

influence  on  engineers  in  the  U.S.A.  - 

Kutishauser’s  devices.  wef  method.  The  power  method  takes  a  s 

The  basic  idea  is  very  simple.  “  *  0  to  produce  the  so-called  Kry  ov  seque 

ing  vector  «  and  keeps  applying  the  operators 

pow',OT""")  ,,B,.BlB,),BVBV- 

„  M  B.v  „tat,  i,  a.  dime.*.  of  .h.  dominant  ,*n»v«to,  o<  B  » 

For  almost  all  v  the  vector  8  «  Point.  ... 

-oo.  „  .  with  m  start™*  vectors,  .be  column,  nt  a 

o X m  matrix  Vt  .  W is  (vWyvW.f. 

The  trouble  with  lormin.  *V«  *.  **  *  tX— ^ 

“t.  F  be  any  n  Xm  ^ 

:i“pie.ee.r^w,tb  implementation  Ill  ol  section  *  in  -*• 

-  ^  ^  '..tloQ  matnc  es  (  often  called  interaction  mntrices  ) 

•  Form  tbemxm  proton  m,trx,^  B  _^w 

a  r  )  _ o  jsi  m  to  find 

a  Solve  the  lull  m  X  m  Eene,ali.ed  eigenvalue  P^le”  *  C  '  ^  * so  ^  ^ _  1,1-1. 

...,  m. 

3.  Form  W  *»  VG. 

The  vectors  . . -  «  *  « 

much  better  approx, motions  tbnn  otbets. 

-  S.  algorithm  is  to  beep  repeatinp  these  Bavieig,*..  approximations. 

Basic  81  Algorithm 

1.  pick  a  full  rank  V10* 


2.  for  k  =  1,  2 . until  satisfied 

a)  obtain  Rayleigh-Ritz  approximations  (  g[k\  6[k > ),  (  g£\  )  from 

b)  compare  <9^  with  6f~^  to  monitor  convergence,  j  =  1 . m 

c)  if  not  satisfied  set 

„j*)  •-*  y(k-i)gjik)  f or  each  ^ad.  j, 
rj*)  :=*  if  Oj  hat  converged. 

For  a  fuller  discussion  of  SI  with  attention  to  more  details  of  implementation  see: 

K.  J.  Bathe  and  coworkers  (7,  Chap.  12];  (8j  for  his  latest. 

A.  Jennings  and  coworkers  [33,  Chap.  10],  and,  in  turn,  [11],  [12],  [31],  (34). 

Both  groups  have  an  engineering  orientation.  The  numerical  analysts  were  mentioned  at  the 
beginning  of  section  6. 

Important  ingredients  to  successful  usage  of  SI  are 

1.  Choice  of  m 

2.  Choice  of  V<°> 

3.  Mechanisms  for  avoiding  convergence  to  already  known  eigenvectors. 

Good  variations  of  SI  have  been  incorporated  into  a  number  of  FEM  packages. 

There  is  only  one  fundamental  criticism  to  be  made  of  SI:  it  may  use  considerably  more  calls 
to  the  operator  subroutine  than  are  necessary.  The  explanation  is  very  simple.  At  every  step  in  SI 
the  approximations  are  overwritten  with  better  ones  in  Vw.  This  overwriting  discards  infor¬ 

mation  and  the  penalty  can  be  significant.  In  fact  it  is  not  necessary  to  increase  storage  costs  to 
achieve  a  substantial  saving  in  computational  effort. 

7.  THE  LANCZOS  APPROACH 

The  power  of  the  Lanczos  algorithm  comes  from  a  fortunate  approximation  property  of 
Krylov  subspaces.  Although  it  is  quite  surprising  it  is  not  difficult  to  state. 

It  is  well  known  that  the  power  sequence  (  see  the  third  paragraph  of  section  ft  for  the  power 
sequence  ),  started  from  a  random  vector,  converges  almost  always  to  the  dominant  eigendirection 
of  the  operator  B.  !n  principle  a  large  number  of  steps  might  be  necessary.  Suppose  we  ask  a 
different  question.  After  k  —  l  steps  of  the  power  method  how  many  eigenvectors  of  B  could  be 
approximated  to  5  decimal  accuracy  by  taking  suitable  linear  combinations  of  our  k  vectors 

v,  Bv,  B2v,  ...,  Bk~l v  ? 

Remember  that  5  decimal  accuracy  in  an  approximate  eigenvector  y  means  10  decimal  accuracy 
in  the  approximate  eigenvalue  9  =  (  ylBy  )/(  y* My  ). 

Now  for  k  =  1  the  answer  is  none  and  for  A:  =  n,  the  number  of  degrees  of  freedom,  the 
answer  is  n.  However  these  extremes  are  not  relevant.  The  amazing  observation  is  that  for  values 
of  k  between  20  and  n/5  the  answer  is  about  k/2.  As  k  increases  the  hit  ratio  improves  but  for 
most  applications  in  Mechanics  it  is  about  50%  or  better.  Consequently  it  should  be  possible  to 
computer  25  eigenpairs  with  only  50  calls  on  the  operator  (  i.e.  50  matrix-vector  products  ).  Yet 
after  50  steps  the  power  method  might  well  have  produced  only  a  3  decimal  approximation  to  a 
single  eigenvector. 

When  one  recalls  that  the  starting  vector  is  chosen  at  random  this  phenomenon  is  remark¬ 
able.  The  eigenvalues  computed  in  this  way  will  be  those  closest  to  the  number  o  that  occurs  in 
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the  definition 

B  **(K  -<tM  )-‘M. 

The  phenomenon  does  not  depend  on  n,  the  number  of  d.o.f. 

The  time  thing  occurs  with  the  block  power  method.  Suitable  combinations  of  aU  the  vec¬ 
tors  in  V,  BV,  B^V,  ...,  B*V  (  i.e.  9m  matrix-vector  products  )  will  usually  produce  3m  accurate 
eigenvectors.  The  block  approximation  is  slightly  weaker  than  the  single  vector  case  which  would 
give  9m/2  eigenvectors. 

The  difficulty  in  trying  to  exploit  the  phenomenon  is  that  the  power  sequence  is  a  hopeless 
one  to  work  with  in  practice;  it  is  nearly  linearly  dependent.  The  Lanczos  algorithm  computes  a 
different  set  of  vectors  that  nevertheless  has  the  same  approximation  property  as  the  power 
sequence.  What  is  even  better  is  that  it  is  not  necessary  to  keep  all  k  Lanczos  vectors  in  the  fast 
memory. 

The  original  attraction  of  the  Lanczos  algorithm  came  from  the  so-called  three  term 
recurrence  relation  presented  as  far  back  as  1950  by  Lanczos  in  [39|.  It  can  be  described  as  fol¬ 
lows. 

Theoretically  the  Lanczos  vectors  are  what  you  get  by  applying  the  Gram-Schmidt  pro¬ 
cedure,  using  the  M-inner  product,  to  the  power  vectors  but  that  is  not  how  they  are  computed. 
Call  them  qlt  q2,  qk.  Then  qt  :=  v  /  ||v||w.  Let  us  consider  the  construction  of  qk+l  from  q 
?2>  •••>  ?*• 

As  in  the  power  method  B  is  applied  to  the  last  term  in  the  current  sequence  to  give  r  := 
B qk.  Now  we  seek  the  part  of  r  that  is  M-orthogonal  to  qu  q2,  ....  qk.  There  is  a  beautiful  result 

that  shows  that  f  is  already  M-orthogonal  to  qx .  qk_2  (  in  exact  arithmetic  ).  So  it  is  only 

necessary  to  execute 

f  .  =>  Bqk, 

r  —r  -  qk_llk  -  qkak 

where 

rlMqk_lt 

ak  r‘Mqk  —  (  r  -  qk_^k  )lMqk 
Then  normalize  r  as  follows 

0k+ 1  :=*  II  r  II M. 

Qk+l  :=5t  r/0k+V 

Unravelling  this  computation  reveals  that 

Bqk  =  qk—i*1k  +  Qkak  +  7*+ i0k+i 

and  this  is  the  three  term  recurrence  relation.  A  little  more  theory  shows  that,  in  exact  arithmetic, 

Ik  =  0k 

so  that  there  is  no  need  to  compute  the  7*.  Theoretically  the  old  Lanczos  vectors  qx,  ...,  qk_ 2  can 
be  sent  off  to  secondary  storage  since  they  are  not  needed  again  until  the  approximate  eigenvec¬ 
tors  are  computed  at  the  end. 

The  coefficients  a(,  go  to  build  up  a  tridiagonal  matrix  Tk  as  indicated, 


a,  fi2 

02  a2  0S 

0S  ■  ■ 

■  ■  0k 

0k  ak 


Tk  = 
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This  matrix  holds  all  the  essential  information.  Notice  how  little  work  there  is  in  computing  qk+1 
compared  with  one  step  of  subspace  iteration  (SI). 

Suppose  k  is  the  last  step.  To  complete  the  calculation  we  proceed  as  follows, 

1)  Compute  all  eigenvalues  and  eigenvectors  of  Tk,  TkS{  =  i=l,  ...,  k  where 

*<*t  =  U*?U)  =  1- 

2)  Select  those  pairs  (  9{,  a(  )  with  the  property  I  a,-(k)  I  <  10-6. 

3)  For  the  values  of  i  found  in  2)  recall  the  Lanczos  vectors  and  compute 

y<  =  +  ?2«i(2)  +  •  •  •  +  ?*«<(*  )• 

The  pairs  (  9{,  y{  )  are  approximate  eigenvectors  with  the  property 

II  By{  —  y(di  ||M  *=■ 

The  6(  will  be  accurate  to  about  10  significant  figures.  More  precisely  there  is  an  eigenvalue  X  of 
B  such  that 

I  X  -  9{  1  <  lO~lo0k+l/ gap 

where  gap  **  min  I  p  —  8{  |  as  min  {  i  9i+l  —  0{  I  ,  I  6t  -  ^,_1  I  }  and  /x  is  the  closest  eigen¬ 
value  of  B  to  9(  except  for  X.  See  [56]. 

The  description  given  above  ignored  the  important  issue  of  detecting  k,  the  right  step  at 
which  to  stop.  See  [26|,  [56]  and  [69] . 

Orthogonality  Loss 

The  preceding  section  may  seem  too  good  to  be  true.  And  it  is.  Finite  precision  arithmetic 
prevents  the  Lanczos  vectors  from  being  M-orthogonal  or  even  close  to  it.  In  Fig.l  we  show  that 
matrix  QlkMQk,  where  Qk  =  (  qu  ....  qk  ).  It  should  be  the  identity  matrix. 

This  defect  gave  the  Lanczos  algorithm  a  bad  reputation  and  only  a  few  engineers  per¬ 
severed  with  it  during  the  1960s.  Sec  (77|  &  [78|.  In  1971  C.  C.  Paige,  in  a  remarkable  thesis,  was 
the  first  to  explain  this  orthogonality  loss  in  a  satisfactory  way.  Once  this  understanding  spread  it 
was  not  difficult  to  find  ways  to  handle  the  problem.  In  particular  he  showed  that  the  original 
algorithm  is  not  unstable;  you  can  extract  all  eigenvalues. 

It  is  beyond  the  scope  of  this  brief  paper  to  expound  Paige’s  contribution.  For  a  text  book 
account  for  numerical  analysts  see  [  59,  Chap.  13  ]  and  for  a  more  detailed,  but  lucid,  presentation 
see  two  papers  by  Paige,  [5 1)  and  [52j. 

What  we  can  say  is  that  there  are  three  different  ways  of  responding  to  orthogonality  loss 
when  executing  the  three  term  recurrence.  The  first,  advocated  by  Paige  himself  (  though  not  in 
the  context  of  Mechanics  ),  has  been  fully  developed  by  Cullum  and  Willoughby,  see  [13[,  [14],  and 
also  Parlett  &  Reid  [55],  The  idea  is  to  keep  the  original  algorithm.  The  result  is  that  the  method 
becomes  truly  iterative,  it  does  not,  stop  at  step  n.  Nevertheless  it  is  possible  to  deduce  the  true 
eigenvalues  of  B  from  the  computed  tridiagonal  T.  However  the  number  of  steps  required  grows 
by  a  factor  of  2  or  3  at  least.  Moreover  eigenvectors  must  be  computed  separately.  These  pro¬ 
grams  seem  best  suited  for  computing  al]^  the  eigenvalues  of  B.  Engineering  development  along 
these  lines  are  [18]  and  [77]. 

The  opposite  extreme  explicitly  applies  the  Gram-Schmidt  process  to  make  Bqj  orthogonal 
to  7i>  <7s,  •••.  <Jj- 2  at  each  step.  Besides  the  cost  of  this  extended  computation  there  is  the  need  to 
access  all  Lanczos  vectors  at  each  step.  However  the  minimal  number  of  steps  will  be  taken.  For 
short  Lanczos  runs  of  about  20  steps  the  cost  of  this  approach  is  not  heavy.  See  [10],  [19],  [20], 
[21],  [23],  [49],  [50],  and  [75], 


Figure  1.  The  loss  of  orthogonality. 
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Legend: 

The  (  i,  j  )  element  indicates  the  inner  product  of  the  ith  and  yth  I.anczos  vectors  computed  by  the  three 
term  reference. 

In  the  matrix  of  Fig. I  an  integer  m  stands  for  the  quantity  10"*e  where  e  is  the  roundoff  unit. 
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A  third  approach  is  based  on  the  observation  that  it  suffices  to  maintain  semi-orthogonality 
to  enjoy  the  main  advantages  of  full-orthogonality.  If  the  unit  roundoff  in  the  arithmetic  unit  is  t 
then  two  vectors  of  norm  one  are  semi-orthogonal  if  their  inner  product  does  not  exceed  >/e  in 
magnitude.  Moreover  semi-orthogonality  among  the  Lanczos  vectors  can  be  maintained  for  about 
1/3  the  cost  of  maintaining  e-orthogonality .  In  particular  the  minimal  number  of  Lanczos  steps 
will  be  taken.  See  (58j,  (69]  and  [71]. 

It  seems  likely  that  in  Mechanics  it  will  be  preferable  to  keep  the  number  of  calls  on  the 
operator  B  to  a  minimum.  For  applications  in  which  the  whole  spectrum  is  wanted  the  first 
approach  is  the  most  attractive. 

To  complicate  the  issue  there  is  the  extra  possibility  of  using  block  versions  of  the  Lanczos 
algorithm.  That  leads  to  the  next  section. 

Block  Lane aoa 

Just  as  the  power  method  is  readily  extended  to  a  block  form  (  namely  subspace  iteration  ) 
so  is  the  Lanczos  recurrence  extended  to  work  with  several  vectors  simultaneously.  A  block  size  I 
is  chosen  and  a  sequence  of  n  XI  Lanczos  matrices  Qlt  Q2,  ....  Qk  is  constructed  to  satisfy 

1.  QjMQj  =  /, 

2.  BQj  —  Qj-iB]  +  QjAj  + 

Here  B{  is  an  I  Xl  upper  triangular  matrix  and  A{  is  I  Xl  symmetric.  The  associated  matrix  Th  is 
no  longer  tridiagonal  but  has  semi-bandwidth  1.  Consequently  ail  calculations  involving  Tk  are 
more  costly  than  with  simple  Lanczos. 

The  original  motivation  for  block  versions  of  Lanczos  was  to  allow  the  detection  of  multiple 
eigenvalues  of  B.  In  exact  arithmetic  the  simple  Lanczos  algorithm  cannot  distinguish  multiplici¬ 
ties  but,  in  practice,  roundoff  error  ensures  that  the  right  number  of  copies  of  an  eigenvalue  will 
be  found,  but  at  the  cost  of  a  few  extra  Lanczos  steps,  provided  that  the  Lanczos  vectors  are  kept 
orthogonal  or  semi-orthogonal.  See  [60j. 

Theoretically  the  simple  Lanczos  algorithm  has  the  best  approximation  properties  for  a 
given  number  of  calls  on  the  operator  B. 

However  the  increasing  complexity  of  modern  operating  systems  and  storage  management 
has  provided  a  powerful  incentive  to  use  block  codes  despite  their  increased  cost  per  step.  If  the 
triangular  factor  L  in  LDL1  =  K  -  <tM  cannot  be  stored  in  the  fast  memory  then  considerable 
transfer  between  primary  and  secondary  storage  is  needed  to  invoke  the  operator  B.  In  such  cases 
it  is  advantageous  to  let  B  operate  on  several  vectors  at  once.  The  precise  number  is  problem  and 
computer  system  dependent. 

We  can  say  that  in  each  case  there  is  a  maximal  number  l  such  that  the  cost  of  applying  B 
to  I  vectors  is  within  10%  of  the  cost  of  applying  B  to  one  vector.  In  a  good  number  of  cases  I  = 
1  but,  in  general  I,  as  defined,  is  the  correct  block  size  -  assuming  that  I  n-vectors  can  be  held  in 
primary  storage  along  with  sections  of  L. 

The  block  Lanczos  program  embedded  in  MSC/NASTRAN  Version  65  permits  the  user  to 
specify  the  block  size.  See  [26], [39],  Block  Lanczos  methods  are  described  in  [25],  [26],  [28],  [45], 
[63],  [67]  &  [69], 

8.  LANCZOS  VERSUS  SUBSPACE  ITERATION  (SI) 

A  comparison  between  two  implementations  is  presented  in  (46]  although  the  experiments 
were  made  in  1980.  Our  only  interest  here  is  to  illustrate  the  approximation  power  that  comes 
from  using  combinations  of  the  power  vectors  instead  of  using  just  the  most  recent  one. 
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A  three  dimensional  building  frame  was  analyzed  by  the  finite  element  method.  There  were 
468  d.o.f.  and  the  60  dominant  mode  shapes  were  to  be  computed.  This  is  more  than  10%  of  the 
spectrum.  The  interesting  question  is  how  many  calls  on  the  operator  subroutine  [  i.e.  solve  ( 
K  —  <tM  )r  **  Mq  for  r  J  are  needed  for  the  computation  of  these  60  eigenvectors. 

Recall  that  S!  has  to  choose  the  subspace  size.  A  well-known  rule  selects  68  but  for  the  given 
application  this  is  inefficient.  We  show  below  in  Table  1  the  number  of  operator  calls  required  by 
several  programs.  • 


Table  It 


q 

No.  of  steps 

Matrix-vector  ops. 

Sl(basic) 

68 

47 

3196 

Sl(accelerated) 

68 

36 

2448 

Sl(accelerated) 

20 

80 

1600 

LANSO 

1 

120 

120 

t  q  m  dimension  of  subspaces  used. 


Several  points  need  to  be  made  in  order  to  see  the  significance  of  Table  1. 

i)  The  program  SI  (accelerated)  incorporates  a  lot  of  clever  tricks  for  improving  the  perfor¬ 
mance  of  SI.  See  [8]  for  the  details.  Consequently  it  is  more  instructive  to  consider  a  straight 
forward  implementation,  namely  SI  (basic),  to  appreciate  the  theoretical  improvement  that 
comes  from  using  a  Krylov  subspace.  Figures  are  not  available  for  SI  (basic)  used  with 
q  =  20  but  we  may  infer  that  at  least  2000  calls  on  the  operator  subroutine  would  be  made. 
Yet  the  bottom  line  of  Table  1  shows  that  only  120  are  needed  by  a  rival  method  starting 
from  a  random  vector. 

ii)  SI  (accelerated)  used  12  different  shifts  (  i.e.  factorizations  of  K  -  oM  )  to  achieve  its 
improvements  whereas  LANSO  used  only  one.  However  LANSO  could  also  benefit  from  use 
of  more  than  one  shift  as  is  shown  in  [20{.  The  benefit  is  in  storage  requirements,  not  calls 
on  the  operator.  For  example,  one  could  use  4  shifts  in  turn  making  about  31  or  32  calls  on 
the  operator  each  time  while  picking  up  15  eigenvectors.  In  this  way  there  only  needs  to  be 
storage  for  35  vectors.  That  means  that  larger  problems  could  be  solved  completely  inside 
the  fast  memory.  The  selection  of  shifts  is  an  interesting  topic.  See  [19]  and  [35]. 

iii)  There  is  a  price  to  be  paid  for  achieving  this  very  small  number  of  calls  on  the  operator  sub¬ 
routine.  It  is  necessary  to  keep  all  the  Lanczos  vectors  somewhere  because  they  are  recalled 
at  the  end  to  accumulate  the  eigenvectors.  It  turns  out  that  they  are  needed  more  often 
than  that  because  of  roundoff  error.  In  the  simplest  implementations  the  new  Lanczos  vector 
is  explicitly  orthogonalized  against  all  previous  Lanczos  vectors  at  each  step.  This  is  overkill. 
Nevertheless  it  is  necessary  to  do  the  orthogonalization  from  time  to  time  about  (  one  step 
in  four,  on  the  average  )  in  order  to  keep  the  number  of  steps  to  a  minimum. 

More  information  on  these  topics  is  contained  in  [19],  [21],  [48|. 

iv)  It  would  be  more  informative  to  compare  SI  with  a  block  Lanczos  code  but  we  are  not  aware 
of  such  a  trial. 


9.  SINGULAR  M 

The  null  space  of  A f  is  also  the  nullspace  of  B  =  (  K  -  aM  )-1Af  (  i.e.  Afv  =  0  if,  and 
only  if,  Bv  —  0  ).  Every  nontrivial  vector  in  this  subspace  .V (B)  may  be  considered  an  eigenvec¬ 
tor  of  B  with  eigenvalue  oo.  Complementary  to  N(B )  is  R(B),  the  range  space  of  B.  It  is  spanned 
by  the  eigenvectors  belonging  to  the  finite  eigenvalues. 
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By  restricting  attention  to  R{B)  the  regular  case  is  recovered.  Of  course  R(B)  is  not  known 
but  in  exact  arithmetic  it  is  easy  for  all  computed  vectors  to  be  kept  in  J?(B).  It  suffices  that  the 
starting  vectors,  for  SI  and  Lanczos,  be  in  R{B).  Unfortunately  rounding  errors  introduce  into  all 
computed  vectors  components  in  N(B).  With  Lanczos  these  components  grow  steadily  as  the 
computation  proceeds  but  this  feature  is  not  at  all  evident  because  the  M-inner  product  is  blind  to 
such  errors.  Engineers  are  often  tempted  to  stay  in  R(B )  by  using  static  condensation,  see  [32], 
but  this  usually  spoils  the  sparsity  structure  of  the  reduced  matrices.  * 

There  is  an  easy  cure.  Each  approximate  eigenvector  y  may  be  projected  back  onto  R(B)  by 
forming  By  and  then  normalizing.  However  this  is  expensive.  If  m  approximate  eigenvectors  are 
computed  using  only  2m  calls  on  B  then  a  further  m  calls  on  B  to  purify  the  eigenvectors  adds 
approximately  50%  to  the  cost. 

Is  there  a  way  to  purify  y  without  making  any  more  calls  on  B  ? 

With  the  Lanczos  algorithm  there  is  a  nice  trick.  See  [48].  It  is  easy  to  explain  and  imple¬ 
ment. 

Let  (  8(,  y{  )  be  an  approximate  eigenpair  with 


kmi 


See  section  7  for  notation.  The  next,  so-far-unused  Lanczos  vector  fy+1  is  available.  It  turns  out 
that 

By{  —  y{8i  +  qf+iPf+MJ) 

to  within  roundoff  terms.  Consequently  it  is  only  necessary  to  replace  y{  by 

5i  :=  Vi  +  Qj+i{  0f+i8iU)/ei  )• 

The  quantity  will  already  .be  known  since  it  provides  an  error  bound  on  y{.  There  is  no 

need  to  have  «,(j)2  <  unit  roundoff  (  from  section  7  )  so  the  correction  to  y{  is  far  from  negligi¬ 
ble.  See  the  Table  in  [48]  for  the  striking  effect. 

This  modification  makes  a  small  improvement  even  when  M  is  invertible.  Consequently  it 
should  be  made  automatically  and  there  is  no  need  for  the  program  to  know  whether  or  not  M  is 
singular.  A  beautiful  result. 


10.  LOOKING  AHEAD 

The  linear  eigenvalue  problem  K  —  \M,  with  real  eigenvalues,  is  in  good  shape  but  there 
are  many  challenges  ahead. 

1.  Efficient,  robust  techniques  for  quadratic  eigenvalue  problems  (  as  indicated  in  Section  3  ). 
Standard  practice  encourages  reduction  to  a  linear  problem,  see  [23]  and  [36]  for  example, 
but  this  may  not  be  optimal.  See  [66]  and  [74]. 

2.  Problems  with  complex  conjugate  pairs  of  eigenvalues  (  as  indicated  in  Section  3  ).  The 
matrices  described  in  this  paper  can  be  extended  to  the  nonsymmetric  case.  See  [30],  [72]  for 
SI  and  [24],  [54]  for  Lanczos.  Also  see  [64], 

3.  Exploiting  vector  computers  to  the  full. 

4.  The  effect  of  parallel  computers. 

That  can  wait  for  the  next  issue  of  State-of-the-Art! 
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